NEW MEASURES OF DIVERSITY


Manzoor  Ahmad 


ABSTRACT 

The problem of measuring diversity within populations and dissimilarity or similarity between populations has been extensively treated in the literature. In this context a general procedure called Analysis of Diversity has been outlined and examined by C.R. Rao in a series of papers.

In this paper we propose three new measures of diversity and study related inference problems. Denote by $S^k$ the simplex $S^k = \{\pi : \pi = (\pi_1, \ldots, \pi_k)',\ \pi_j \ge 0,\ \sum_j \pi_j = 1\}$. Then the proposed measures are of the form

$H_m(\pi) = 1 - a_m \sum_{j=1}^{k} \pi_j \phi_m(\pi_j), \quad m = 1, 2, 3,$

where $\phi_1(x) = (1 + k^{-1} - x)^{-\gamma}$, $\gamma \ge 0$; $\phi_2(x) = (2 - k^{-\gamma} - x^{\gamma})^{-1}$, $\gamma \ge 0$; $\phi_3(x) = (a_3 + (1 - x)^{\gamma})^{-1}$, $0 < \gamma \le 1$; and the a's are suitable normalizing constants. Estimation of $H_m(\pi)$, derivation of the penalty function and cross entropy and the problem of testing independence have been treated. Asymptotic distributions of relevant test statistics are indicated.


1. INTRODUCTION


The problem of measuring diversity within populations and dissimilarity or similarity between populations has been extensively treated in the literature. This problem arises in a wide variety of domains: linguistics (Horvath, 1963; Zinger, 1982; Greenberg, 1956; Guiraud, 1959; Herdan, 1964, 1966; Yule, 1944; Savchpe, 1964), sociology (Agresti and Agresti, 1978), biology (Sokal and Sneath, 1963; Pielou, 1975; Patil and Taillie, 1979), anthropology (Rao, 1971, 1977b), to mention a few. An extensive bibliography of papers on measures of diversity and their [...] et al (1979) and Patil and Taillie (1982).


Diversity within populations and dissimilarity between populations have been measured and interpreted differently. The choice of a diversity measure essentially depends on the context of a problem; however, any diversity measure satisfying certain basic conditions can be used for partitioning the total variability into a number of additive components, each of which can be used to test a certain null hypothesis or estimate a component of the variability. Rao (1982a, b) outlined a general procedure called Analysis of Diversity (ANODIV) which is similar to the Analysis of Variance (ANOVA) for quantitative data. In this direction Light and Margolin (1971, 1974) and Anderson and Landis (1980) have studied the Gini-Simpson index of diversity, while Nayak (1984) has extended their results for Quadratic Entropy introduced by Rao (1982b, c).


Following the general procedure of Rao (1982a, b), any function H defined on the simplex $S^k = \{\pi : \pi = (\pi_1, \ldots, \pi_k)';\ \pi_i \ge 0,\ \sum_i \pi_i = 1\}$ of the Euclidean space $R^k$ is said to be a diversity measure if it satisfies the following conditions:

(a) $H(\pi) \ge 0$, with '= 0' if and only if $\pi_j = 1$ for some j and $\pi_{j'} = 0$ $\forall j' \ne j$;

(b) $H(\pi) \le 1$, with '= 1' if and only if $\pi$ is a uniform distribution, i.e. $\pi_1 = \pi_2 = \ldots = \pi_k = 1/k$;

(c) $H(\pi)$ is concave in $\pi$ on $S^k$.

While condition (a) is natural and (b) is a standard normalization, condition (c) fulfills the requirement that the diversity in a weighted mixture of populations should not be smaller than the weighted sum of diversities within the individual populations. The Gini-Simpson index of diversity, the quadratic entropy of Rao, Shannon's entropy, the α-degree entropy of Rényi (1961) and the α-degree entropy of Havrda and Charvat (1956), among others, satisfy conditions (a), (b) and (c). (See Nayak (1985a)).

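We consider three measures which are of the form, $\forall\ \pi \in S^k$,

(1.1)    $H_m(\pi) = 1 - a_m \sum_{j=1}^{k} \pi_j \phi_m(\pi_j), \quad m = 1, 2, 3$

where

(1.2)    $a_1 = k^{-\gamma}, \quad \phi_1(x) = (1 + k^{-1} - x)^{-\gamma}, \quad \gamma > 0$

(1.3)    $a_2 = 1 - k^{-\gamma}, \quad \phi_2(x) = (2 - k^{-\gamma} - x^{\gamma})^{-1}, \quad \gamma > 0$

(1.4)    $a_3 = (1 - k^{-1})^{\gamma}, \quad \phi_3(x) = (a_3 + (1 - x)^{\gamma})^{-1}, \quad 0 < \gamma \le 1$

These functions vanish only at the vertices $e_j$, $j = 1, 2, \ldots, k$ of $S^k$, where the probability vector $e_j$ represents a multinomial distribution whose j-th cell has cell frequency one and others zero. In section 2, we have shown that $H_1(\pi)$ is concave for $\gamma \ge 0$, $H_2(\pi)$ for $\gamma \ge 1$ and $H_3(\pi)$ for $0 < \gamma \le 1$. Further, $\forall\ \pi \in S^k$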


(1.5)    $H_m(\pi) \ge 0, \quad m = 1, 2, 3$

follows from the concavity of $H_m$ since $\pi = \sum_{j=1}^{k} \pi_j e_j$ and $\sum_j \pi_j = 1$.

We have also shown in section 2 that these measures take their maximum value at $\pi = (1/k, 1/k, \ldots, 1/k)'$, the most spread multinomial population. Define

(1.6)    $\bar{H}_m = \max_{\pi \in S^k} H_m(\pi)$

Then we have

(1.7)    $\bar{H}_1 = (k^{\gamma} - 1)/k^{\gamma}, \quad \gamma > 0; \qquad \bar{H}_2 = \bar{H}_3 = 1/2.$


Further, since these functions are symmetric in $(\pi_1, \ldots, \pi_k)$, they turn out to be Schur-concave, which is indeed a desirable property for measuring variability in a multinomial population. With such measures, the more spread-out the population, the more diverse it turns out to be.
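As an illustration of definitions (1.1)-(1.4), the following Python sketch evaluates the three measures and lets one check numerically the properties stated above: the value zero at a vertex, the maximum values (1.7) at the uniform vector, and concavity under mixing. The function names `a_const`, `phi` and `diversity` are our own and do not come from the paper; the parameter `gamma` is assumed to lie in the admissible range of each measure.

```python
def a_const(m, k, gamma):
    """Normalizing constant a_m of (1.2)-(1.4)."""
    if m == 1:
        return k ** (-gamma)
    if m == 2:
        return 1.0 - k ** (-gamma)
    if m == 3:
        return (1.0 - 1.0 / k) ** gamma
    raise ValueError("m must be 1, 2 or 3")


def phi(m, x, k, gamma):
    """Kernel phi_m(x) of (1.2)-(1.4)."""
    if m == 1:
        return (1.0 + 1.0 / k - x) ** (-gamma)
    if m == 2:
        return 1.0 / (2.0 - k ** (-gamma) - x ** gamma)
    if m == 3:
        return 1.0 / (a_const(3, k, gamma) + (1.0 - x) ** gamma)
    raise ValueError("m must be 1, 2 or 3")


def diversity(m, pi, gamma=1.0):
    """H_m(pi) = 1 - a_m * sum_j pi_j * phi_m(pi_j), as in (1.1)."""
    k = len(pi)
    total = sum(p * phi(m, p, k, gamma) for p in pi)
    return 1.0 - a_const(m, k, gamma) * total


if __name__ == "__main__":
    uniform = [0.25] * 4
    vertex = [1.0, 0.0, 0.0, 0.0]
    for m in (1, 2, 3):
        # Each line shows: measure index, value at a vertex, value at uniform.
        print(m, diversity(m, vertex, 0.5), diversity(m, uniform, 0.5))
```

With k = 4 and γ = 1/2, the uniform vector should give 1 - 4^{-1/2} for m = 1 and 1/2 for m = 2, 3, in agreement with (1.7).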

In section 3 we have treated the problem of estimating the diversity of a multinomial population, based on the measure $H_m$; m = 1, 2, 3.

Derivation of the penalty function (Haberman (1982)) and cross entropy (Rao (1982b)) for each of the proposed measures and the problem of testing independence have been treated in section 4.


2. CONCAVITY OF THE MEASURES $H_m$; m = 1, 2, 3

From (1.1) it is obvious that the concavity of $H_m(\pi)$ would follow from the convexity of

(2.1)    $I_m(\pi) = \sum_{j=1}^{k} \pi_j \phi_m(\pi_j); \quad m = 1, 2, 3.$

While proving their convexity, the functions $I_m$ can be treated, without loss of generality, as functions of $\pi_1, \pi_2, \ldots, \pi_{k-1}$ with $\pi_k = 1 - (\pi_1 + \ldots + \pi_{k-1})$; $\pi \in S^k$.


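2a. Convexity of $I_1(\pi)$

Since

$I_1(\pi) = \sum_{j=1}^{k} \pi_j (1 + k^{-1} - \pi_j)^{-\gamma} = \sum_{j=1}^{k} \pi_j \eta_j^{-\gamma}$ (say)

it follows that

(2.2)    $\frac{\partial}{\partial \pi_j} I_1 = \eta_j^{-\gamma-1}[1 + k^{-1} + (\gamma - 1)\pi_j] - \eta_k^{-\gamma-1}[1 + k^{-1} + (\gamma - 1)\pi_k],$

$\frac{\partial^2}{\partial \pi_i \partial \pi_j} I_1 = \delta_k$ for $i \ne j$, and $= \delta_j + \delta_k$ for $i = j$,

where

$\delta_t = \gamma \eta_t^{-\gamma-2}[2(1 + k^{-1}) + (\gamma - 1)\pi_t],$

$\eta_t = 1 + k^{-1} - \pi_t.$

For a given vector $d = (d_1, \ldots, d_p)'$, let $D_d$ denote a diagonal matrix with elements $d_1, d_2, \ldots, d_p$. Now, the matrix $\nabla^2 I_1$ of second order derivatives of $I_1$ takes the form

(2.3)    $\nabla^2 I_1 = D_{\delta} + \delta_k \mathbf{1}\mathbf{1}'$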

with 




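2c. Convexity of $I_3(\pi)$

Note that

(2.6)    $I_3(\pi) = \sum_{i=1}^{k} \pi_i (b + \bar{\pi}_i^{\gamma})^{-1}$

where $b = (1 - k^{-1})^{\gamma}$ and $\bar{\pi}_i = 1 - \pi_i$, and

(2.7)    $\frac{\partial}{\partial \pi_j} I_3 = (b + \bar{\pi}_j^{\gamma})^{-1} + \gamma \pi_j \bar{\pi}_j^{\gamma-1} (b + \bar{\pi}_j^{\gamma})^{-2} - (b + \bar{\pi}_k^{\gamma})^{-1} - \gamma \pi_k \bar{\pi}_k^{\gamma-1} (b + \bar{\pi}_k^{\gamma})^{-2},$

$\frac{\partial^2}{\partial \pi_i \partial \pi_j} I_3 = \delta_k, \quad i \ne j,$

with

$\delta_t = \gamma (2 - (1 + \gamma)\pi_t)\, \bar{\pi}_t^{\gamma-2} (b + \bar{\pi}_t^{\gamma})^{-2} + 2\gamma^2 \pi_t \bar{\pi}_t^{2\gamma-2} (b + \bar{\pi}_t^{\gamma})^{-3}.$

Hence

(2.8)    $\nabla^2 I_3 = D_{\delta} + \delta_k \mathbf{1}\mathbf{1}',$

where the elements of $\delta$ are $(\delta_1, \ldots, \delta_{k-1})'$; $\nabla^2 I_3$ is p.d. iff $0 < \gamma \le 1$.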
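The expression for $\delta_t$ is the second derivative of the summand $x (b + (1-x)^{\gamma})^{-1}$ of $I_3$, so it can be checked against a finite-difference approximation. The sketch below is only a numerical sanity check under that reading of the formula; the helper names `psi3`, `delta3` and `second_difference` are hypothetical, and x is assumed to lie strictly inside (0, 1).

```python
def psi3(x, k, gamma):
    """Summand x / (b + (1 - x)**gamma) of I_3, with b = (1 - 1/k)**gamma."""
    b = (1.0 - 1.0 / k) ** gamma
    return x / (b + (1.0 - x) ** gamma)


def delta3(x, k, gamma):
    """Closed form for the second derivative of psi3 (the delta_t of section 2c)."""
    b = (1.0 - 1.0 / k) ** gamma
    y = 1.0 - x                      # y plays the role of pi-bar
    d = b + y ** gamma
    first = gamma * (2.0 - (1.0 + gamma) * x) * y ** (gamma - 2.0) / d ** 2
    second = 2.0 * gamma ** 2 * x * y ** (2.0 * gamma - 2.0) / d ** 3
    return first + second


def second_difference(f, x, h=1e-4):
    """Central second-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2


if __name__ == "__main__":
    k, gamma = 4, 0.5
    for x in (0.1, 0.3, 0.6):
        numeric = second_difference(lambda t: psi3(t, k, gamma), x)
        print(x, delta3(x, k, gamma), numeric)
```

Positive values of `delta3` over the interior of (0, 1) are what one would expect when the summand is convex there.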


2d. Maxima of the Functions $H_m(\pi)$; m = 1, 2, 3

Critical points of $H_m(\pi)$ for m = 1, 2, 3 are solutions of the system of equations

(2.9)    $\frac{\partial}{\partial \pi_j} H_m(\pi) = -a_m \frac{\partial}{\partial \pi_j} I_m(\pi) = 0; \quad j = 1, 2, \ldots, k-1.$

These equations, when m = 1, are of the form, for j = 1, 2, ..., k-1,

$(1 - \gamma)\eta_j^{-\gamma} + \gamma(1 + k^{-1})\eta_j^{-\gamma-1} = \kappa$ (const.)

Hence one can easily argue that the solutions must satisfy the condition

(2.10)    $\eta_1 = \eta_2 = \ldots = \eta_{k-1} = \eta_k$

or equivalently

$\pi_1 = \pi_2 = \ldots = \pi_{k-1} = \pi_k = 1/k$

since the constant $\kappa$ is the value of the l.h.s. evaluated at $\pi_k$. Through analogous arguments, so turns out to be the case with $H_2(\pi)$ and $H_3(\pi)$.


3. ESTIMATION OF $H_m(\pi)$; m = 1, 2, 3

For inference problems, it is essential to estimate a measure of diversity $H(\pi)$. A popular estimate based on sample proportions $p_1, \ldots, p_k$ would be $H(\hat{\pi})$ where $\hat{\pi}_j = p_j$, $\forall j$. This is also the maximum likelihood estimator of $H(\pi)$ (Zehna (1966)). A Taylor series expansion of $H(\hat{\pi})$ around $\pi$ allows us to compute the asymptotic variance of $H(\hat{\pi})$. To do so we first express $H_m(\pi)$ as

(3.1)    $H_m(\pi) = 1 - \sum_{j=1}^{k} \psi_m(\pi_j)$

with

$\psi_m(x) = a_m x \phi_m(x), \quad m = 1, 2, 3$

and treat $H_m$ as a function of (k-1) free variables $\pi_1, \pi_2, \ldots, \pi_{k-1}$, with $\pi_k = 1 - \sum_{j=1}^{k-1} \pi_j$, $\pi_j \ne 0$ for j = 1, 2, ..., k, i.e. $H_m(\pi_1, \pi_2, \ldots, \pi_{k-1}) = 1 - \sum_{j=1}^{k-1} \psi_m(\pi_j) - \psi_m(\pi_k)$. Then

(3.2)    $\sum_{j=1}^{k-1} (\hat{\pi}_j - \pi_j) \frac{\partial}{\partial \pi_j} H_m(\pi) = -\sum_{j=1}^{k-1} (\hat{\pi}_j - \pi_j) \frac{d}{d\pi_j} \psi_m(\pi_j) + \sum_{j=1}^{k-1} (\hat{\pi}_j - \pi_j) \frac{d}{d\pi_k} \psi_m(\pi_k) = -\sum_{j=1}^{k} (\hat{\pi}_j - \pi_j)\, d_{mj}$ (say)

since $\partial \pi_k / \partial \pi_j = -1$, $\forall j$, and $\sum_j \hat{\pi}_j = \sum_j \pi_j = 1$. Now we can see easily that the asymptotic variance of $H_m(\hat{\pi})$, denoted by $\sigma_m^2$, is the variance of the linear combination $\sum_{j=1}^{k} d_{mj} \hat{\pi}_j$ where


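(3.3)    $d_{mj} = \frac{d}{d\pi_j} [a_m \pi_j \phi_m(\pi_j)] = a_m \{\phi_m(\pi_j) + \pi_j \frac{d}{d\pi_j} \phi_m(\pi_j)\}, \quad j = 1, 2, \ldots, k.$

It follows that, for m = 1, 2 and 3, the asymptotic variance of the estimator $H_m(\hat{\pi})$ of the diversity measure $H_m(\pi)$ is

(3.4)    $n\sigma_m^2 = \sum_{j=1}^{k} \pi_j d_{mj}^2 - \{\sum_{j=1}^{k} \pi_j d_{mj}\}^2.$

The sequences $\{d_{1j}\}_{j=1}^{k}$, $\{d_{2j}\}_{j=1}^{k}$ and $\{d_{3j}\}_{j=1}^{k}$ corresponding to $\sigma_1^2$, $\sigma_2^2$ and $\sigma_3^2$ respectively are

(3.5)    $d_{1j} = a_1 (1 + k^{-1} - \pi_j)^{-\gamma-1} \{1 + k^{-1} + (\gamma - 1)\pi_j\},$

$d_{2j} = a_2 (2 - k^{-\gamma} - \pi_j^{\gamma})^{-2} \{2 - k^{-\gamma} + (\gamma - 1)\pi_j^{\gamma}\}$

and

$d_{3j} = a_3 ((1 - k^{-1})^{\gamma} + (1 - \pi_j)^{\gamma})^{-2} \{(1 - k^{-1})^{\gamma} + (1 + (\gamma - 1)\pi_j)(1 - \pi_j)^{\gamma-1}\},$

j = 1, 2, ..., k.

Remark. Note that $n\sigma_m^2$ is equal to the variance of a random variable $D_m$ which takes the value $d_{mj}$ with probability $\pi_j$, j = 1, 2, ..., k.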
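The Remark suggests a direct way to compute (3.4): evaluate the coefficients $d_{mj}$ of (3.5) and take the variance of the resulting discrete distribution. The sketch below does this, assuming all cell probabilities lie strictly between 0 and 1; `psi`, `d_coeff`, `n_sigma2` and `standard_error` are illustrative names of our own, and the standard error is the usual delta-method plug-in, not a formula quoted from the paper.

```python
import math


def psi(m, x, k, gamma):
    """a_m * x * phi_m(x), the summand of H_m in (3.1)."""
    if m == 1:
        return k ** (-gamma) * x * (1.0 + 1.0 / k - x) ** (-gamma)
    if m == 2:
        return (1.0 - k ** (-gamma)) * x / (2.0 - k ** (-gamma) - x ** gamma)
    a3 = (1.0 - 1.0 / k) ** gamma
    return a3 * x / (a3 + (1.0 - x) ** gamma)


def d_coeff(m, x, k, gamma):
    """Coefficient d_mj of (3.5), i.e. the derivative of psi at x."""
    if m == 1:
        b = 1.0 + 1.0 / k
        return k ** (-gamma) * (b - x) ** (-gamma - 1.0) * (b + (gamma - 1.0) * x)
    if m == 2:
        b = 2.0 - k ** (-gamma)
        a2 = 1.0 - k ** (-gamma)
        return a2 * (b - x ** gamma) ** (-2) * (b + (gamma - 1.0) * x ** gamma)
    a3 = (1.0 - 1.0 / k) ** gamma
    tail = (1.0 + (gamma - 1.0) * x) * (1.0 - x) ** (gamma - 1.0)
    return a3 * (a3 + (1.0 - x) ** gamma) ** (-2) * (a3 + tail)


def n_sigma2(m, pi, gamma=1.0):
    """n * sigma_m^2 of (3.4): variance of D_m taking value d_mj w.p. pi_j."""
    k = len(pi)
    d = [d_coeff(m, p, k, gamma) for p in pi]
    mean = sum(p * dj for p, dj in zip(pi, d))
    return sum(p * dj * dj for p, dj in zip(pi, d)) - mean ** 2


def standard_error(m, counts, gamma=1.0):
    """Plug-in standard error of H_m(pi-hat) from observed cell counts."""
    n = sum(counts)
    p_hat = [c / n for c in counts]
    return math.sqrt(max(n_sigma2(m, p_hat, gamma), 0.0) / n)


if __name__ == "__main__":
    print(standard_error(1, [50, 30, 20]))
```

At the uniform vector all $d_{mj}$ coincide, so the first-order variance vanishes there and a higher-order expansion would be needed.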

Case γ = 1

Each of the diversity measures $H_m(\pi)$, m = 1, 2 or 3, can be seen as a family of measures since it depends on a parameter of our choice, γ. Any choice of γ within the range of values for which $H_m(\pi)$ remains concave would lead to a specific measure. In practice, as we will see in the sequel, the choice of γ would depend upon the nature of the problem. However the choice γ = 1 should be emphasized since in this case the $H_m$'s take a simpler form. Define the diversity measure $G_m$ as

(3.6)    $G_m(\pi) = H_m(\pi)$ with γ = 1.

Then

(3.7)    $G_1(\pi) = 1 - k^{-1} \sum_{j=1}^{k} \pi_j (1 + k^{-1} - \pi_j)^{-1},$

$G_2(\pi) = G_3(\pi) = 1 - (1 - k^{-1}) \sum_{j=1}^{k} \pi_j (2 - k^{-1} - \pi_j)^{-1}$

and the variances of the estimators $G_1(\hat{\pi})$ and $G_2(\hat{\pi})$ respectively are

(3.8)    $n\sigma^2[G_1] = k^{-2}(1 + k^{-1})^2 \sum_j \pi_j \{(1 + k^{-1} - \pi_j)^{-2} - \sum_l \pi_l (1 + k^{-1} - \pi_l)^{-2}\}^2$

and

$n\sigma^2[G_2] = (1 - k^{-1})^2 (2 - k^{-1})^2 \sum_j \pi_j \{(2 - k^{-1} - \pi_j)^{-2} - \sum_l \pi_l (2 - k^{-1} - \pi_l)^{-2}\}^2.$
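For γ = 1 the measures reduce to simple rational functions, which makes the identity $G_2 = G_3$ easy to confirm numerically. The sketch below writes $G_1$ and $G_2$ from (3.7), writes $G_3$ separately from the definition of $H_3$ with γ = 1, and evaluates the variance expressions in their "variance of a discrete random variable" form. All function names are hypothetical, and the squared constants in `n_var_g` follow from (3.4)-(3.5) at γ = 1.

```python
def g1(pi):
    """G_1 of (3.7): H_1 with gamma = 1."""
    k = len(pi)
    return 1.0 - (1.0 / k) * sum(p / (1.0 + 1.0 / k - p) for p in pi)


def g2(pi):
    """G_2 of (3.7): H_2 with gamma = 1."""
    k = len(pi)
    return 1.0 - (1.0 - 1.0 / k) * sum(p / (2.0 - 1.0 / k - p) for p in pi)


def g3(pi):
    """H_3 with gamma = 1, written from (1.4); expected to coincide with g2."""
    k = len(pi)
    a3 = 1.0 - 1.0 / k
    return 1.0 - a3 * sum(p / (a3 + (1.0 - p)) for p in pi)


def n_var_g(pi, which):
    """n * sigma^2 for G_1 (which=1) or G_2 (which=2) at gamma = 1."""
    k = len(pi)
    if which == 1:
        a, b = 1.0 / k, 1.0 + 1.0 / k
    else:
        a, b = 1.0 - 1.0 / k, 2.0 - 1.0 / k
    w = [(b - p) ** (-2) for p in pi]              # (b - pi_j)^(-2)
    mean = sum(p * wj for p, wj in zip(pi, w))     # sum_l pi_l (b - pi_l)^(-2)
    return (a * b) ** 2 * sum(p * (wj - mean) ** 2 for p, wj in zip(pi, w))


if __name__ == "__main__":
    pi = [0.5, 0.3, 0.2]
    print(g1(pi), g2(pi), g3(pi), n_var_g(pi, 1), n_var_g(pi, 2))
```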

4. DECOMPOSITION AND TEST OF INDEPENDENCE

Consider a population P of a nominal random variable Y that assumes the integral values j, $1 \le j \le k$, which is being viewed as a mixture of r populations $P_1, P_2, \ldots, P_r$ of Y identified according to r discrete levels of a factor X of interest. Let $\pi_i = (\pi_{i1}, \ldots, \pi_{ij}, \ldots, \pi_{ik})'$ be the probability vector of Y for the population $P_i$, and $\lambda_i$ be the mixing weight of $P_i$ for the overall population P, $\lambda_i > 0$, i = 1, 2, ..., r; $\sum_i \lambda_i = 1$. Hence Y is assumed to follow a multinomial distribution whose probability vector $\pi_{\cdot}$ is the mixture $\sum_{i=1}^{r} \lambda_i \pi_i$. Based on the data classified in the above fashion, we are usually interested in a problem of prediction or testing a hypothesis of independence or testing a hypothesis $H_0: \pi_1 = \pi_2 = \ldots = \pi_r$.

Such inference problems are handled through the Analysis of Diversity (ANODIV).
In this regard the following decomposition, due to Rao (1982a, b), is most natural:

(4.1)    $H(\pi_{\cdot}) = \sum_i \lambda_i H(\pi_i) + J_H(\{\lambda_i\}, \{\pi_i\}).$

The component $\sum_i \lambda_i H(\pi_i)$ is the average diversity within the populations, and the second term $J_H$, designated as "Jensen difference" and defined by subtraction, represents the diversity between populations. The concavity of H ensures that $J_H \ge 0$.
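The decomposition (4.1) is straightforward to compute for any diversity function. The following sketch takes the mixing weights, the population probability vectors and a diversity function, and returns the total, within and between components; the measure used in the example is $H_1$ with γ = 1, and the names `g1` and `anodiv` are illustrative only.

```python
def g1(pi):
    """H_1 with gamma = 1: 1 - (1/k) * sum_j pi_j / (1 + 1/k - pi_j)."""
    k = len(pi)
    return 1.0 - (1.0 / k) * sum(p / (1.0 + 1.0 / k - p) for p in pi)


def anodiv(weights, populations, H):
    """Return (total, within, between) for the decomposition (4.1)."""
    k = len(populations[0])
    # Mixture vector: sum_i lambda_i * pi_i, computed cell by cell.
    mixture = [sum(w * p[j] for w, p in zip(weights, populations))
               for j in range(k)]
    total = H(mixture)
    within = sum(w * H(p) for w, p in zip(weights, populations))
    between = total - within          # the Jensen difference J_H
    return total, within, between


if __name__ == "__main__":
    pops = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
    print(anodiv([0.4, 0.6], pops, g1))
```

Because the between component is defined by subtraction, its non-negativity in the output is a consequence of the concavity of the chosen H rather than something the code enforces.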

An alternative but similar decomposition, which provides an interpretation of $J_H$, can be obtained through the concept of 'penalty function' associated with a diversity measure (Rao and Nayak (1985)).

Let $A(j, \pi^*)$ be the penalty (or the loss) to be incurred in a probabilistic prediction if a probability vector $\pi^*$ is used for prediction and the true category is j. Then the expected penalty for using $\pi^*$ is $\sum_j \pi_j A(j, \pi^*)$. If a diversity measure H is strictly concave then there exists a non-negative and possibly infinite function $A_H(j, \pi)$ such that

(4.2)    (i) $H(\pi) = \sum_j \pi_j A_H(j, \pi) \quad \forall\ \pi \in S^k$

and

(4.3)    (ii) $H(\pi) = \sum_j \pi_j A_H(j, \pi) \le \sum_j \pi_j A_H(j, \pi^*)$

for all $\pi, \pi^* \in S^k$, with equality only if $\pi = \pi^*$. The existence of $A_H$ for every strictly concave function H is due to Haberman (1982), and it can be obtained as follows.

Let $H^*$ be an extension of H to $R_+^k$ such that $\forall\ a \ge 0$

(4.4)    $H^*(a\pi) = aH^*(\pi).$

Then for $\pi \in S^k$ with $\pi_j > 0$, $1 \le j \le k$, the penalty function $A_H(j, \pi)$ is given by

(4.5)    $A_H(j, \pi) = \frac{\partial}{\partial \pi_j} H^*(\pi).$

Footnote: the decomposition (4.1) is also expressed as (in analogy to variance decomposition) SST = SSW + SSB.


In terms of the penalty function associated with a strictly concave function H, the following decomposition of the total diversity $H(\pi_{\cdot})$ (or SST), obtained by Rao and Nayak (1985), allows an interpretation of the diversity between populations $J_H$ (or SSB):

(4.6)    $H(\pi_{\cdot}) = \sum_i \lambda_i H(\pi_i) + \sum_i \lambda_i C_H(\pi_i, \pi_{\cdot})$

where

(4.7)    $C_H(\pi, \pi^*) = \sum_{j=1}^{k} \pi_j [A_H(j, \pi^*) - A_H(j, \pi)].$

The function $C_H(\cdot, \cdot)$, called the 'cross-entropy' induced by H, is non-negative but not necessarily symmetric. A more general discussion can be found in Rao and Nayak (1985).

Since $\pi_1, \ldots, \pi_r$ are associated with r levels of a factor X, the ratio

(4.8)    $\frac{SSB}{SST} = \frac{J_H}{H(\pi_{\cdot})}$

can be used as a measure of association between X and the response variable Y.

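Now we give the extensions $H^*$ and penalty functions $A_H(j, \pi)$, essentially needed to compute the cross-entropy (i.e. measure the dissimilarity between $\pi$ and $\pi^*$), for the proposed diversity measures $H_1$, $H_2$ and $H_3$.

Extensions $H_1^*$, $H_2^*$ and $H_3^*$ satisfying the condition (4.4) are, $\forall\ \pi \in R_+^k$,

(4.9)    $H_1^*(\pi) = \sum_j \pi_j [1 - a_1 (\sum \pi_l)^{\gamma} \{b_1 \sum \pi_l - \pi_j\}^{-\gamma}]$

(4.10)    $H_2^*(\pi) = \sum_j \pi_j [1 - a_2 (\sum \pi_l)^{\gamma} \{b_2 (\sum \pi_l)^{\gamma} - \pi_j^{\gamma}\}^{-1}]$

(4.11)    $H_3^*(\pi) = \sum_j \pi_j [1 - a_3 (\sum \pi_l)^{\gamma} \{b_3 (\sum \pi_l)^{\gamma} + (\sum \pi_l - \pi_j)^{\gamma}\}^{-1}]$

where

$b_1 = 1 + k^{-1}$, $b_2 = 2 - k^{-\gamma}$ and $b_3 = (1 - k^{-1})^{\gamma} = a_3$;

the '$\sum$' represents $\sum_{l=1}^{k}$.

The penalty functions $A_{H_m}$ induced by the functions $H_m$, according to (4.5), turn out to be, for m = 1, 2, 3,

(4.12)    $A_{H_1}(j, \pi) = \frac{a_1 \bar{\gamma} \pi_j - a_1 b_1}{(b_1 - \pi_j)^{\gamma+1}} + 1 + \sum_{l=1}^{k} \frac{\gamma a_1 \pi_l^2}{(b_1 - \pi_l)^{\gamma+1}}$

(4.13)    $A_{H_2}(j, \pi) = \frac{a_2 \bar{\gamma} \pi_j^{\gamma} - a_2 b_2}{(b_2 - \pi_j^{\gamma})^2} + 1 + \sum_{l=1}^{k} \frac{\gamma a_2 \pi_l^{\gamma+1}}{(b_2 - \pi_l^{\gamma})^2}$

(4.14)    $A_{H_3}(j, \pi) = \frac{a_3 \bar{\pi}_j^{-\bar{\gamma}} (\bar{\gamma} \pi_j - 1) - a_3 b_3}{(b_3 + \bar{\pi}_j^{\gamma})^2} + 1 + \sum_{l=1}^{k} \frac{\gamma a_3 \pi_l^2 \bar{\pi}_l^{-\bar{\gamma}}}{(b_3 + \bar{\pi}_l^{\gamma})^2}$

where $\bar{\gamma} = 1 - \gamma$ and $\bar{\pi}_j = 1 - \pi_j$.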
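The relation between the extension, the penalty function and the cross-entropy can be checked numerically for $H_1$. The sketch below implements the homogeneous extension of $H_1$, the closed-form penalty obtained by differentiating it, and the cross-entropy of (4.7); a finite-difference gradient of the extension can then be compared with the closed form. The function names are ours, and the closed form coded here is the one derived from the extension under the definitions $a_1 = k^{-\gamma}$, $b_1 = 1 + k^{-1}$.

```python
def h1_star(pi, gamma):
    """Degree-one homogeneous extension of H_1; k is the length of pi."""
    k = len(pi)
    a1, b1 = k ** (-gamma), 1.0 + 1.0 / k
    s = sum(pi)
    return sum(p * (1.0 - a1 * s ** gamma * (b1 * s - p) ** (-gamma))
               for p in pi)


def penalty_h1(j, pi, gamma):
    """Penalty A_{H_1}(j, pi): partial derivative of h1_star on the simplex."""
    k = len(pi)
    a1, b1 = k ** (-gamma), 1.0 + 1.0 / k
    head = (a1 * (1.0 - gamma) * pi[j] - a1 * b1) / (b1 - pi[j]) ** (gamma + 1.0)
    tail = sum(gamma * a1 * p * p / (b1 - p) ** (gamma + 1.0) for p in pi)
    return head + 1.0 + tail


def cross_entropy_h1(pi, pi_star, gamma):
    """C_H(pi, pi*) = sum_j pi_j [A(j, pi*) - A(j, pi)], as in (4.7)."""
    return sum(p * (penalty_h1(j, pi_star, gamma) - penalty_h1(j, pi, gamma))
               for j, p in enumerate(pi))


if __name__ == "__main__":
    pi, gamma, h = [0.5, 0.3, 0.2], 2.0, 1e-6
    for j in range(len(pi)):
        up = list(pi); up[j] += h
        dn = list(pi); dn[j] -= h
        numeric = (h1_star(up, gamma) - h1_star(dn, gamma)) / (2.0 * h)
        print(j, numeric, penalty_h1(j, pi, gamma))
```

On the simplex, the expected penalty $\sum_j \pi_j A_{H_1}(j, \pi)$ reproduces $H_1(\pi)$, and weighting the cross-entropies of the component populations against their mixture recovers the Jensen difference.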


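In the case γ = 1,

(4.15)    $A_{G_1}(j, \pi) = -\frac{k^{-1}(1 + k^{-1})}{(1 + k^{-1} - \pi_j)^2} + 1 + \sum_{l=1}^{k} \frac{k^{-1} \pi_l^2}{(1 + k^{-1} - \pi_l)^2}$

and

(4.16)    $A_{G_2}(j, \pi) = A_{G_3}(j, \pi) = -\frac{(1 - k^{-1})(2 - k^{-1})}{(2 - k^{-1} - \pi_j)^2} + 1 + \sum_{l=1}^{k} \frac{(1 - k^{-1}) \pi_l^2}{(2 - k^{-1} - \pi_l)^2}.$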


Let us now consider the problem of testing the hypothesis $H_0: \pi_1 = \pi_2 = \ldots = \pi_r$. Following Rao (1982a, b), a test of this hypothesis can be based on $J_H$ (i.e. SSB) since under $H_0$, $J_H = 0$ in the population and conversely $J_H = 0$ implies $H_0$, provided H is strictly concave.

Based on a sample from a population P of the k-dimensional nominal r.v. Y divided into sub-samples according to r levels ($r \ge 2$) of the factor X, we are going to propose a criterion, based on SSB, to test the null hypothesis that X and Y are independent, i.e. $H_0: \pi_1 = \pi_2 = \ldots = \pi_r$.

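For the ith level of the factor X, i = 1, 2, ..., r, let $n_{ij}$ be the observed frequency for the jth category of Y, j = 1, 2, ..., k, in a sample of size $n_{\cdot\cdot} = \sum_i \sum_j n_{ij}$. Further let

(4.17)    $n_{i\cdot} = \sum_j n_{ij}$, $\quad n_i = (n_{i1}, \ldots, n_{ik})'$, $\quad p_{i\cdot} = n_i / n_{i\cdot}$ and $p_{\cdot\cdot} = \sum_i n_i / n_{\cdot\cdot}$.

The total, within and between group diversities for the sample are

(4.18)    $\widehat{SST} = H(p_{\cdot\cdot}),$

$\widehat{SSW} = \sum_i \frac{n_{i\cdot}}{n_{\cdot\cdot}} H(p_{i\cdot})$

and

$\widehat{SSB} = \widehat{SST} - \widehat{SSW}.$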

Naik (1985) has shown that asymptotically, under $H_0$, (i) $\widehat{SSB}$ is distributed as a linear combination of independent $\chi^2$ variables and (ii) $\widehat{SSB}$ and $\widehat{SST}$ are independently distributed. For the sake of completeness and for determining the critical region for testing $H_0$, we give the basic assumptions and the main results of Naik (1985).

For a statistical analysis the sample vectors $n_i = (n_{i1}, \ldots, n_{ik})'$, i = 1, 2, ..., r, are assumed to be independently and multinomially distributed with parameters $n_{i\cdot}$ and $\pi_i = (\pi_{i1}, \ldots, \pi_{ik})'$.

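Let $H^*$ be a non-negative, strictly concave and twice differentiable function defined on $R_+^k$ satisfying the condition (4.4). Let

(4.19)    $\nabla^2 H^*(\pi) = \left(\left( \frac{\partial^2 H^*(\pi)}{\partial \pi_j \partial \pi_{j'}} \right)\right)$

be the matrix of second order derivatives of $H^*$. Then, under $H_0: \pi_1 = \pi_2 = \ldots = \pi_r = \pi$, as $n_{i\cdot} \to \infty$ and $n_{i\cdot}/n_{\cdot\cdot} \to \lambda_i > 0$, asymptotically,

[I]    $2 n_{\cdot\cdot} \widehat{SSB} \sim -\sum_{j=1}^{k-1} \delta_j \chi_j^2(r-1)$

where $\chi_j^2(r-1)$, j = 1, ..., k-1 are i.i.d. $\chi^2$ random variables with (r-1) d.f. and $\delta_j$, j = 1, 2, ..., (k-1) are the possible non-zero eigenvalues of

(4.20)    $\nabla^2 H^*(\pi) \cdot D_{\pi} = \Delta_{H^*}$ (say),

$D_{\pi}$ being the diagonal matrix with elements $\pi_1, \ldots, \pi_k$;

[II]    $\widehat{SST}$ and $\widehat{SSB}$ are independently distributed.

For a proof of results [I] and [II], see Naik (1985b).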

For testing the null hypothesis that X and Y are independent, i.e. $H_0: \pi_1 = \ldots = \pi_r$, against the general alternative, a natural criterion, based on analysis of diversity using a diversity measure H, would be to reject $H_0$ at level α if $\widehat{SSB} \ge c$, choosing c such that

(4.21)    $P\{\widehat{SSB} \ge c \mid H_0\} = \alpha.$


The result [I] of Naik (1985), cited above, becomes useful for determining the critical value c. For each of the proposed diversity measures $H_m$, m = 1, 2, 3, we have computed the matrix $\Delta_{H_m^*}$, m = 1, 2, 3, with the help of the extensions $H_m^*$ given in (4.9), (4.10) and (4.11). Although in practice it is possible, with the help of existing computer programmes, to compute the eigenvalues of $\Delta_{H^*}$ based on the estimate $\hat{\pi} = p_{\cdot\cdot}$, the following approximation for the asymptotic distribution of $\widehat{SSB}$ can be used. Approximating the eigenvalues $\delta_i$, i = 1, 2, ..., k-1 by their average value

(4.22)    $\bar{\beta}_{H^*} = \frac{1}{k-1} \sum_{j=1}^{k-1} \delta_j = \frac{1}{k-1} Tr(\Delta_{H^*}) = \frac{1}{k-1} \sum_{j=1}^{k} \pi_j d_{jj}(\pi)$

where

$d_{jj}(x) = \left[ \frac{\partial^2 H^*(\pi)}{\partial \pi_j^2} \right]_{\pi = x},$

the asymptotic distribution of $2 n_{\cdot\cdot} \widehat{SSB}$ can be approximated by the distribution of $-\bar{\beta}_{H^*} \chi^2((r-1)(k-1))$. Further, if the estimator

(4.23)    $\hat{\beta}_{H^*} = \frac{1}{k-1} \sum_{j=1}^{k} p_{\cdot\cdot j}\, d_{jj}(p_{\cdot\cdot})$

is a consistent estimator of $\bar{\beta}_{H^*}$, then we shall have

(4.24)    $-\frac{2 n_{\cdot\cdot} \widehat{SSB}}{\hat{\beta}_{H^*}} \sim \chi^2((r-1)(k-1)),$

see Naik (1985). Light and Margolin (1971), using the Gini-Simpson index of diversity, and Nayak (1984), using the quadratic entropy of Rao, have found the above approximation useful.
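The approximate test can be sketched for $H_1$ as follows: compute the sample decomposition from an r x k table of counts, estimate the average eigenvalue from the diagonal second derivatives of the extension of $H_1$ evaluated at the pooled proportions, and form the scaled statistic to be referred to a $\chi^2$ distribution with (r-1)(k-1) d.f. This is a minimal sketch under assumptions: the function names are hypothetical, the diagonal second derivative is coded from the expression $\gamma a_1 \{(2\pi_j - 1) B_j (b_1 - \pi_j)^{-\gamma-2} - S_1\}$ with $B_t = 2b_1 - (1-\gamma)\pi_t$ and $S_1 = \sum_t \pi_t^2 B_t (b_1 - \pi_t)^{-\gamma-2}$, and the factor 2 in the statistic follows result [I].

```python
def h1(pi, gamma=1.0):
    """H_1(pi) = 1 - k^(-gamma) * sum_j pi_j (1 + 1/k - pi_j)^(-gamma)."""
    k = len(pi)
    b1 = 1.0 + 1.0 / k
    return 1.0 - k ** (-gamma) * sum(p * (b1 - p) ** (-gamma) for p in pi)


def beta_hat_h1(p, gamma=1.0):
    """Estimated average eigenvalue: (1/(k-1)) * sum_j p_j * d_jj(p)."""
    k = len(p)
    a1, b1 = k ** (-gamma), 1.0 + 1.0 / k
    # w_t = B_t * (b1 - p_t)^(-gamma-2), with B_t = 2*b1 - (1 - gamma)*p_t
    w = [(2.0 * b1 - (1.0 - gamma) * x) * (b1 - x) ** (-gamma - 2.0) for x in p]
    s1 = sum(x * x * wt for x, wt in zip(p, w))
    d_diag = [gamma * a1 * ((2.0 * x - 1.0) * wt - s1) for x, wt in zip(p, w)]
    return sum(x * d for x, d in zip(p, d_diag)) / (k - 1)


def anodiv_statistic(table, gamma=1.0):
    """Return (statistic, degrees of freedom) for an r x k table of counts."""
    r, k = len(table), len(table[0])
    n_rows = [sum(row) for row in table]
    n = sum(n_rows)
    p_rows = [[c / ni for c in row] for row, ni in zip(table, n_rows)]
    p_all = [sum(row[j] for row in table) / n for j in range(k)]
    sst = h1(p_all, gamma)
    ssw = sum(ni / n * h1(p, gamma) for ni, p in zip(n_rows, p_rows))
    ssb = sst - ssw
    # beta_hat is negative for a concave H, so the statistic is non-negative.
    return -2.0 * n * ssb / beta_hat_h1(p_all, gamma), (r - 1) * (k - 1)


if __name__ == "__main__":
    stat, df = anodiv_statistic([[40, 5, 5], [5, 5, 40]])
    print(stat, df)
```

For a 2 x 3 table the reference distribution has 2 d.f., whose upper 5% point is 5.991; the statistic would be compared with that value at the 5% level.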
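The matrix $\Delta_{H_m^*}$ and the average eigenvalue $\bar{\beta}_{H_m^*}$ corresponding to each of the proposed diversity measures $H_m$, m = 1, 2, 3, along with the elements of $\nabla^2 H_m^*$, are as follows.

For m = 1, 2, 3 and $\pi \in S^k$, let

(4.25)    $\nabla^2 H_m^*(\pi) = ((d_{m,jj'}));$

then, for m = 1,

(4.26)    $d_{1,jj} = \gamma a_1 \{(2\pi_j - 1) B_j (b_1 - \pi_j)^{-\gamma-2} - S_1\}$

(4.27)    $d_{1,jj'} = \gamma a_1 \{\pi_j B_j (b_1 - \pi_j)^{-\gamma-2} + \pi_{j'} B_{j'} (b_1 - \pi_{j'})^{-\gamma-2} - S_1\}$

where

$a_1 = k^{-\gamma}$, $b_1 = 1 + k^{-1}$, $B_t = 2b_1 - (1 - \gamma)\pi_t$ and $S_1 = \sum_{t=1}^{k} \frac{\pi_t^2 B_t}{(b_1 - \pi_t)^{\gamma+2}}.$

For m = 2,

(4.28)    $d_{2,jj} = a_2 \gamma \{(2\pi_j - 1) \pi_j^{\gamma-1} C_j (b_2 - \pi_j^{\gamma})^{-3} - S_2\}$

(4.29)    $d_{2,jj'} = a_2 \gamma \{\pi_j^{\gamma} C_j (b_2 - \pi_j^{\gamma})^{-3} + \pi_{j'}^{\gamma} C_{j'} (b_2 - \pi_{j'}^{\gamma})^{-3} - S_2\}$

where

$a_2 = 1 - k^{-\gamma}$, $b_2 = 2 - k^{-\gamma}$, $C_t = (1 + \gamma) b_2 - (1 - \gamma)\pi_t^{\gamma}$, and $S_2 = \sum_{t=1}^{k} \frac{\pi_t^{1+\gamma} C_t}{(b_2 - \pi_t^{\gamma})^3}.$

Finally for m = 3,

(4.30)    $d_{3,jj} = a_3 \gamma \{(2\pi_j - 1) \bar{\pi}_j^{\gamma-2} D_j (b_3 + \bar{\pi}_j^{\gamma})^{-3} - S_3\}$

(4.31)    $d_{3,jj'} = a_3 \gamma \{\pi_j \bar{\pi}_j^{\gamma-2} D_j (b_3 + \bar{\pi}_j^{\gamma})^{-3} + \pi_{j'} \bar{\pi}_{j'}^{\gamma-2} D_{j'} (b_3 + \bar{\pi}_{j'}^{\gamma})^{-3} - S_3\}$

where

$a_3 = (1 - k^{-1})^{\gamma}$, $b_3 = (1 - k^{-1})^{\gamma}$, $\bar{\pi}_t = 1 - \pi_t$,

$D_t = b_3 (1 + \gamma) \bar{\pi}_t + (1 + \gamma) \bar{\pi}_t^{\gamma} + (1 - \gamma) \bar{\pi}_t^{\gamma+1} + b_3 (1 - \gamma)$

and

$S_3 = \sum_{t=1}^{k} \frac{\pi_t^2 \bar{\pi}_t^{\gamma-2} D_t}{(b_3 + \bar{\pi}_t^{\gamma})^3}.$

With these computations, the matrix $\Delta_{H_m^*}$, for m = 1, 2, 3, can be worked out as

(4.32)    $\Delta_{H_m^*}(\pi) = ((d_{m,jj'})) \cdot D_{\pi}$

(4.33)    $\bar{\beta}_{H_m^*} = \frac{1}{k-1} \sum_{j=1}^{k} \pi_j d_{m,jj}$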
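Case γ = 1

For the diversity measures $G_m$ defined in (3.6), we have

(4.34)    (i): $\Delta_{G_1^*}(\pi) = ((d^*_{1,jj'})) \cdot D_{\pi}, \qquad \bar{\beta}_{G_1^*} = \frac{1}{k-1} \sum_{j=1}^{k} \pi_j d^*_{1,jj}$

where

$d^*_{1,jj} = 2 a_1^* b_1^* \left\{ \frac{2\pi_j - 1}{(b_1^* - \pi_j)^3} - \sum_{t=1}^{k} \frac{\pi_t^2}{(b_1^* - \pi_t)^3} \right\},$

$d^*_{1,jj'} = 2 a_1^* b_1^* \left\{ \frac{\pi_j}{(b_1^* - \pi_j)^3} + \frac{\pi_{j'}}{(b_1^* - \pi_{j'})^3} - \sum_{t=1}^{k} \frac{\pi_t^2}{(b_1^* - \pi_t)^3} \right\},$

where $a_1^* = k^{-1}$, $b_1^* = 1 + k^{-1}$;

(4.35)    (ii): $\Delta_{G_2^*}(\pi) = ((d^*_{2,jj'})) \cdot D_{\pi}$

where

$d^*_{2,jj} = 2 a_2^* b_2^* \left\{ \frac{2\pi_j - 1}{(b_2^* - \pi_j)^3} - \sum_{t=1}^{k} \frac{\pi_t^2}{(b_2^* - \pi_t)^3} \right\},$

$d^*_{2,jj'} = 2 a_2^* b_2^* \left\{ \frac{\pi_j}{(b_2^* - \pi_j)^3} + \frac{\pi_{j'}}{(b_2^* - \pi_{j'})^3} - \sum_{t=1}^{k} \frac{\pi_t^2}{(b_2^* - \pi_t)^3} \right\},$

$a_2^* = 1 - k^{-1}$, and $b_2^* = 2 - k^{-1}$, so that

$\bar{\beta}_{G_2^*} = -\frac{2 a_2^* b_2^*}{k-1} \sum_{j=1}^{k} \frac{\pi_j (1 - \pi_j)}{(b_2^* - \pi_j)^3};$

(4.36)    (iii): $\Delta_{G_3^*} = \Delta_{G_2^*}$ and $\bar{\beta}_{G_3^*} = \bar{\beta}_{G_2^*}$.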